Arabic natural language processing: An overview

Abstract

Arabic is recognised as the 4th most used language of the Internet. Arabic has three main varieties: (1) Classical Arabic (CA), (2) Modern Standard Arabic (MSA), and (3) Arabic Dialect (AD). MSA and AD can be written either in Arabic script or in Roman script (Arabizi), which corresponds to Arabic written with Latin letters, numerals and punctuation. Due to the complexity of this language and the number of corresponding challenges for NLP, many surveys have been conducted in order to synthesise the work done on Arabic. However, these surveys principally focus on two varieties of Arabic (MSA and AD, written in Arabic letters only), and they are slightly old (no such survey since 2015) and therefore do not cover recent resources and tools. To bridge the gap, we propose a survey focusing on 90 recent research papers (74% of which were published after 2015). Our study presents and classifies the work done on the three varieties of Arabic, concentrating on both Arabic and Arabizi, and associates each work with its publicly available resources whenever available.

Similar resources

An overview of empirical natural language processing.(Natural Language

In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech reco...

An Overview of Empirical Natural Language Processing

In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, i...

Arabic Natural Language Processing for Information Retrieval

Human Language Technology has played a big role in implementing Latin-based information retrieval systems. Two of the most cited techniques are stemming and truncation. Numerous studies have shown that the inflectional structure of words has a big impact on the retrieval accuracy of Latin-based language information retrieval systems (IRS). Stemming or truncation is done for two principal reas...

Introduction to Arabic Natural Language Processing

This book provides system developers and researchers in natural language processing and computational linguistics with the necessary background information for working with the Arabic language. The goal is to introduce Arabic linguistic phenomena and review the state-of-the-art in Arabic processing. The book discusses Arabic script, phonology, orthography, morphology, syntax and semantics, with...

An Overview of Probabilistic Tree Transducers for Natural Language Processing

Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language processing, due to powerful generic methods for applying, composing, and learning them. Unfortunately, FSTs are not a good fit for much of the current work on probabilistic modeling for machine translation, summarization, paraphrasing, and language modeling. These methods operate directly on trees, ra...

Journal

Journal title: Journal of King Saud University - Computer and Information Sciences

Year: 2021

ISSN: 2213-1248, 1319-1578

DOI: https://doi.org/10.1016/j.jksuci.2019.02.006